Virtual storage system

ABSTRACT

The invention concerns a system for backing up data originating from a host computer (mainframe), characterized in that it comprises computer equipment including an input/output interface for exchanging data with the host computer, said interface comprising a backup reader/inscriber emulator, at least one intermediate storage device and a tape reader/inscriber, the equipment further comprising a processor for making transfers between the input/output interface or the intermediate storage device and the tape reader/inscriber, the system further including a supervisor comprising a memory for recording information about the tape recordings of the computer equipment, and for controlling said computer equipment according to instructions coming from the host computer.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a national phase filing of and claims the benefit of priority to International Application Number PCT/FR01/02420, filed Jul. 24, 2001, entitled “Systeme de Stockage Virtuel,” which translates to “Virtual Storage System”.

This application also relates to the following co-pending applications: 1) International Application Number PCT/FR01/02381, filed Jul. 20, 2001, entitled “Procede de Sauvegarde de Donnees Informatiques,” which translates to “Method for Saving Computer Data”; 2) International Application Number PCT/FR01/01324, filed Apr. 27, 2001, entitled “Système de sauvegarde et de restauration automatique de donnees provenant d'une pluralite d'equipements hôtes en environnement heterogene,” which translates to “Backup and restore system for data derived from a plurality of host equipment in heterogeneous environment”.

The entire disclosure contained in each of the above-mentioned patent applications is incorporated by reference as if set forth at length herein.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

Not applicable

REFERENCE OF A “MICROFICHE APPENDIX”

Not applicable

FIELD OF THE INVENTION

This invention relates to the domain of storage of computer data, and more specifically to storage on media such as large capacity cassettes, by remote equipment usually including a cassette manipulation robot.

BRIEF DESCRIPTION OF THE PRIOR ART

International Published Application No. WO9844423 discloses a computer network comprising a number of storage control units, each being coupled to a plurality of storage assemblies, the said assemblies comprising at least one high capacity memory device (MSD). Each storage control unit may be coupled to at least one host processing system and at least one other storage control unit to control access of host processing systems to high capacity memory devices. Several data copies are stored in storage assemblies that are geographically remote from each other, so that any host can access any copy. Each storage control unit comprises an interface with a host that emulates a high capacity memory device independent of the type of storage device and an interface with a local storage assembly that emulates a host independent of the host type. Hosts access stored data by means of virtual addressing. Storage control units make automatic backups and error corrections and write-protect backup copies.

U.S. Pat. No. 5,809,511 discloses a system for transfer of data from a host station and complementary equipment comprising cache memory and robot controlled backup support management equipment.

SUMMARY OF THE INVENTION

The purpose of the invention is to provide an improved backup system that can be used by a heterogeneous set of host computers connected to a common non-specific backup equipment. Generally, the invention relates to a system for the backup of data originating from a host computer [mainframe] characterised in that it comprises computer equipment including an input-output interface for exchanging data with the host computer, the said interface comprising a backup reader-inscriber emulator, at least one hard disk and a tape reader-inscriber, the equipment also comprising a processor for making transfers between the input-output interface or the tape reader interface, and the tape reader-inscriber, the system also comprising a supervisor comprising a memory for saving information about records on the computer equipment tape, and to control the said computer equipment as a function of instructions originating from the host computer.

Advantageously, the emulator is composed of a computer for analysing signals originating from the host computer and for generating a response corresponding to the type of simulated cassette reader-inscriber.

The invention also relates to a process for backing up data from a host computer [mainframe] characterised in that the input-output interface of a backup equipment is emulated so that behaviour of the backup equipment towards the host machine is identical to a streamer, the said backup equipment comprising an intermediate storage means that is not a streamer.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be better understood after reading the description given below of a non-limitative example of the embodiment with reference to the appended drawings in which:

FIG. 1 shows the principle diagram of the present invention.

FIG. 2 shows an aspect of the present invention constructed according to the teachings herein.

DETAILED DESCRIPTION OF THE INVENTION

The system described in the following is used to back up data originating from a heterogeneous set of “mainframe” type host machines (1) connected to an SCSI type computer network (2).

The backup equipment (3) comprises one or several streamers (4) for backing up data on a magnetic medium.

It is connected to the network through an emulated input-output interface circuit (5) such that the backup equipment (3) is seen by the host machine in the form of an emulated type streamer, for all functions performed by the backup equipment (3). The emulated interface emulates the main known streamers, to enable a transparent dialogue between the host machine and the backup equipment (3).

The backup equipment (3) also comprises at least one intermediate storage device (9) composed of RAID hard disks in the described example.

The backup equipment includes initiators (6, 7) for each of the backup media. A computer controls the different resources to transfer data from the input-output interface (5) to the intermediate storage device (9) and vice versa, and to transfer data from the intermediate storage device (9) to streamers (4) and vice versa.

Seen from the host machine, the backup equipment according to the invention satisfies the following specifications:

It has exactly the same behaviour as the streamer that it replaces.

It improves the data storage speed through a disk cache. Data are stored on a disk partition, in order to accelerate backing up and restoring the data. Data access is improved by means of a metamodel of backed up data that memorises the data mapping. This metamodel enables direct access to sequentially stored data.

It copies the data onto a streamer. Data backed up on the disk partition are copied onto the tape, reproducing the initial write mechanism by using the model.

It enables persistence and coherence of the data. At the end of the backup, the backup equipment guarantees the persistence and coherence of data on the tape and in the partition. It also makes it possible to decorrelate the upstream streamer type (that is being emulated) from the downstream streamer (that is actually being controlled). On the upstream side, the backup equipment manages one streamer model, and backs up data on another streamer model.

The backup equipment (3) makes the following connection types:

on the upstream side: SCSI, FC, ESCON, Bus&Tag

on the downstream side: SCSI, FC.

The backup equipment manages several connections on the upstream and downstream sides simultaneously. Consequently, it executes several transfers in parallel. Each transfer is managed by a transfer unit.

A transfer unit manages three types of links:

link with a host system

link with a partition of a physical disk

link with the streamer.

The system also comprises a supervisor station (12) connected through serial links (13, 14) firstly to the host machine and secondly to the backup equipment.

The emulation consists of simulating the SCSI operation of a streamer with regard to a host machine and managing the SCSI responses to the different requests from the host and backup transfers.

The supervision station (12) controls a database in which the identification labels of the backed up data are stored.

The data volumes written by host machines are initially created in a buffer disk space (9). The maximum size of these volumes is fixed at the time of the configuration of the backup system, and is usually fairly small, of the order of 250 Mbytes. Secondly, one or several copies of the volumes are transferred onto cartridges. Only the actually meaningful data are transferred to tape. Thus, for example, a volume with a maximum size of 250 Mbytes may actually contain only 10 Mbytes of data. In this case, only these 10 Mbytes are transferred to tape, in order to optimise tape space.

The backup equipment uses a database to internally manage the list of known volumes, by storing a certain amount of information such as:

the name of the volume

the medium on which it is stored (disk, cartridge)

the position on the medium (disk partition number, or logical start and end addresses on the cartridge)

etc.

This information is essential to be able to find a volume.
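As an illustration only, the volume information listed above could be held in a small record such as the following. The class name `VolumeEntry`, its field names and the `locate` helper are hypothetical and are not taken from the described equipment; the sketch simply shows how the name, medium and position are enough to find a volume.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VolumeEntry:
    """One row of a hypothetical volume table (names are assumptions)."""
    name: str                              # name of the volume
    medium: str                            # "disk" or "cartridge"
    disk_partition: Optional[int] = None   # used when medium == "disk"
    start_address: Optional[int] = None    # used when medium == "cartridge"
    end_address: Optional[int] = None      # used when medium == "cartridge"


def locate(catalog: Dict[str, VolumeEntry], name: str) -> str:
    """Describe where a volume is stored, or report that it is unknown."""
    entry = catalog.get(name)
    if entry is None:
        return f"{name}: unknown volume"
    if entry.medium == "disk":
        return f"{name}: disk partition {entry.disk_partition}"
    return f"{name}: cartridge, addresses {entry.start_address}-{entry.end_address}"


catalog = {
    "VOL001": VolumeEntry("VOL001", "disk", disk_partition=3),
    "VOL002": VolumeEntry("VOL002", "cartridge", start_address=0, end_address=4095),
}
print(locate(catalog, "VOL002"))
```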

At the time that data are transferred from the disk cache to cartridges, private data called “Basic data” are added, at the end of the transfer of each volume. These data are only written onto the cartridges, and are ignored during transfers in the reverse direction, in the case in which a volume is transferred from a cartridge to the disk cache, for example to be restored by the host machine. Therefore, they are entirely managed internally by the backup equipment according to the invention and transparently for host machines.

The basic data for a given volume are written in the form of an ASCII character string with the following structure:

Title CR LF
VolumeStartPosition VolumeEndPosition VolumeSize ReaderChannel/
DiskChannel DiskPartition ProcessorNumber
BarCode CartridgeName
CartridgeType SizeUsed CartridgeSize
LoadCounter VolumeName VolumeStatus HostCode CodingType
WriteDate WriteTime ReadDate ReadTime
EmptyDate EmptyTime CR LF

Title: title indicating the meaning of the following main fields in abbreviated form.

CR: ASCII carriage return, character code 13 decimal (0x0D hexadecimal)

LF: ASCII line feed, character code 10 decimal (0x0A hexadecimal)

VolumeStartPosition: logical address of the start of the volume on the cartridge.

VolumeEndPosition: logical address of the end of the volume on the cartridge.

VolumeSize: approximate size of the volume in kbytes.

ReaderChannel: number of the reader (defined in the HBS configuration) used to make the transfer from the disk cache volume to the cartridge.

DiskChannel: number of the disk (defined in the HBS configuration) in which the volume is located at the time that it is transferred to the cartridge.

DiskPartition: number of the disk partition in which the volume is located before it is transferred to the cartridge.

ProcessorNumber: number of the processor used to transfer the volume from the disk cache to the cartridge.

BarCode: bar code of the cartridge containing the volume.

CartridgeName: cartridge name, as declared under HBS. This name is independent of the bar code.

CartridgeType: hexadecimal code indicating the cartridge type. The possible values are as follows:

0x0000001L    operating cartridge
0x00000010L   cartridge with read access
0x00000020L   cartridge with write access
0x00000080L   cartridge being reorganised
0x00000100L   cartridge to be reorganised
0x00000200L   cartridge not to be reused
0x00000400L   blocked empty cartridge
0x00000800L   reorganised cartridge
0x00001000L   archive type cartridge
0x00002000L   mirror type cartridge
0x00010000L   cartridge for DLT reader
0x00020000L   cartridge for Exabyte reader
0x00040000L   cartridge for 3480 reader
0x00080000L   cartridge for 3590 reader
0x01F00000L   mask for number of the archive pool or mirror to which the cartridge belongs

The code used for the CartridgeType field may be a combination of the previous values.
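Because the CartridgeType code is a combination of bit values, it can be decoded by testing each mask in turn. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the field is assumed to appear in the record as a hexadecimal string with an optional trailing `L`, and the pool number is assumed to be obtained by shifting the masked bits right by 20 (the position of the lowest bit of the 0x01F00000 mask).

```python
# Bit values taken from the CartridgeType table; descriptions are shortened.
CARTRIDGE_FLAGS = {
    0x00000001: "operating",
    0x00000010: "read access",
    0x00000020: "write access",
    0x00000080: "being reorganised",
    0x00000100: "to be reorganised",
    0x00000200: "not to be reused",
    0x00000400: "blocked empty",
    0x00000800: "reorganised",
    0x00001000: "archive",
    0x00002000: "mirror",
    0x00010000: "DLT reader",
    0x00020000: "Exabyte reader",
    0x00040000: "3480 reader",
    0x00080000: "3590 reader",
}
POOL_MASK = 0x01F00000
POOL_SHIFT = 20  # assumption: pool number sits in the bits covered by the mask


def parse_hex_field(text):
    """Convert a field such as '0x00001000L' to an integer (format assumed)."""
    return int(text.rstrip("Ll"), 16)


def decode_cartridge_type(code):
    """Return the list of flag names set in the code and the pool number."""
    names = [name for mask, name in CARTRIDGE_FLAGS.items() if code & mask]
    pool = (code & POOL_MASK) >> POOL_SHIFT
    return names, pool


print(decode_cartridge_type(parse_hex_field("0x00311001L")))
```

The same mask-and-test approach applies to any field built as a combination of indicator bits.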

SizeUsed: total size of data stored on the cartridge, in Megabytes.

CartridgeSize: maximum capacity of the cartridge, in Megabytes.

LoadCounter: cartridge load counter. Indicates the number of times that the cartridge was loaded in a reader. These data are used to determine cartridge wear.

VolumeName: volume name, as it is known by the host machine.

VolumeStatus: hexadecimal code indicating the volume status. This code is a combination of indicators for which the access masks and possible values are as follows:

0x0000001L    1 if the volume is valid, and 0 if it is invalid (old version or logically erased volume)
0x0000008L    1 if the volume is of the mirror type
0x00000010L   1 if the volume has a mirror copy on another cartridge
0x00000020L   1 if a copy of this volume is to be made on a mirror cartridge
0x00001000L   1 if the volume is of the archive type
0x00002000L   1 if the volume is shared between several host systems
0x00010000L   1 if the volume must always be copied on DLT cartridges
0x00020000L   1 if the volume must always be copied on Exabyte cartridges
0x00040000L   1 if the volume must always be copied on 3480 cartridges
0x00080000L   1 if the volume must always be copied on 3590 cartridges
0x01F00000L   number of the archive pool or mirror (from 0 to 31)

HostCode: number of the host machine to which the volume belongs, in the HBS configuration.

CodeType: character code used in the volume header (0=ASCII, 1=EBCDIC)

WriteDate: date of the most recent write or modification of the volume by the host machine, in the form yyyy-mm-dd

WriteTime: time of the most recent write or modification of the volume by the host machine, in the form hh:mm:ss

ReadDate: date of the most recent read access of the volume by the host machine, in the form yyyy-mm-dd

ReadTime: time of the most recent read access of the volume by the host machine, in the form hh:mm:ss

EmptyDate: date on which the disk cache volume was transferred to the cartridge, in the form dd-mm-yyyy

EmptyTime: time at which the disk cache volume was transferred to the cartridge, in the form hh:mm:ss

Basic data are cumulative, which speeds up the analysis of cartridges when the database has to be reconstructed.

Referring now to FIG. 2, assume that a tape contains volumes V1, V2, V3, V4 and V5. The basic data associated with each of these volumes are called B1, B2, B3, B4 and B5. Therefore, on the tape, the basic data B1 only contain data related to volume V1. The basic data B2 contain the accumulated data for B1 and data about volume V2 in a single data record. Therefore B2 contains data for V1 and V2.

Basic data B3 contain the accumulated data for B2 and data about volume V3 in a single data record. Therefore B3 contains data for V1, V2 and V3.

Therefore the final basic data on the cartridge, B5 in the previous example, contain an accumulated total of all data about all volumes present on the cartridge.

If a cartridge contains a very large number of volumes, the accumulated basic data may be large. In order to limit this increase in size, a maximum size has been arbitrarily fixed at 132 kbytes. When the standard construction of basic data for a volume exceeds 132 kbytes, the equipment (3) assigns reduced basic data to this volume, to contain only basic data for this new volume without accumulating data for previous volumes. For subsequent volumes, the standard mechanism for accumulating data for the current volume with data for the previous volume will be repeated.
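The accumulation rule and its size limit can be sketched as follows. This is an illustration under stated assumptions, not the equipment's code: each volume's own basic data are modelled as a string, a record is modelled as a list of such strings, the function name `build_basic_data` is hypothetical, and 1 kbyte is assumed to be 1024 bytes.

```python
MAX_BASIC_DATA_BYTES = 132 * 1024  # assumption: 1 kbyte = 1024 bytes


def build_basic_data(entries, limit=MAX_BASIC_DATA_BYTES):
    """Return, for each volume in tape order, the basic data record written after it.

    entries: per-volume basic data strings, in the order the volumes are written.
    """
    records = []
    previous = []
    for entry in entries:
        candidate = previous + [entry]           # standard accumulation
        if sum(len(e) for e in candidate) > limit:
            candidate = [entry]                  # reduced basic data: new volume only
        records.append(candidate)
        previous = candidate                     # accumulation resumes from here
    return records


# With a deliberately tiny limit, the fourth record is reduced.
for record in build_basic_data(["V1", "V2", "V3", "V4", "V5"], limit=6):
    print(record)
```

With the small limit used in the usage lines, the record written after V4 contains only V4, and the record after V5 accumulates V4 and V5 again.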

If the database in the system is lost completely, the base can be completely reconstructed using these basic data. An integrated function in the processor code is used to analyse a cartridge to extract the most recent basic data from it. This analysis may also be done by external software; all that is necessary is to move to the end of the tape, to go back one record and read the last data record. The basic data thus retrieved at the end of the cartridge contain a description of the volumes on the cartridge. As described in a previous paragraph, if the Volumeaddress field in the first volume contains a value not equal to zero, then the first volume is not at the beginning of the tape. The conclusion is that the basic data are reduced. In this case, all that is necessary is to go to the cartridge at the address Volumeaddress, and then work backwards from the record to be able to read the basic data for the previous volume. These data are an accumulation of the basic data for the previous volumes.

The backwards analysis of the cartridge must be continued until the basic data with the address Volumeaddress equal to 0 are found for the first volume. All volumes on the cartridge may then be found by accumulating all retrieved basic data.
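The backwards analysis described above can be sketched as a loop. The model is an assumption made for illustration: a basic data record is a list of `(volume_name, volume_address)` pairs, and `basic_data_before` is a hypothetical callable standing in for "position the tape at this address, go back one record and read it".

```python
def collect_volumes(last_record, basic_data_before):
    """Rebuild the full volume list of a cartridge from its basic data.

    last_record: the basic data read at the end of the tape, as a list of
        (volume_name, volume_address) pairs in tape order.
    basic_data_before: callable returning the basic data record located just
        before the given tape address (hypothetical tape access helper).
    """
    found = list(last_record)
    record = last_record
    # A non-zero address for the first volume means the record is reduced,
    # so earlier volumes are described by an earlier record.
    while record[0][1] != 0:
        record = basic_data_before(record[0][1])
        found = list(record) + found
    return found


# Example tape: B3 accumulates V1..V3, B4 was reduced, B5 accumulates V4 and V5.
earlier_records = {300: [("V1", 0), ("V2", 100), ("V3", 200)]}
last = [("V4", 300), ("V5", 400)]
print(collect_volumes(last, earlier_records.__getitem__))
```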

The base is reconstructed by retrieving all basic data stored on all cartridges in the library, and then using appropriate software to analyse them. Together, these data include everything necessary to reconstruct the base. To do this, the first step is to have a list of all volumes contained on all cartridges, and also to determine whether or not each volume of a cartridge is valid for the host machine. The same volume (same name, same host system) may be present on several different cartridges, or at several locations on the same cartridge. This can occur for the following reasons:

either they are several different versions of the same volume that was updated by the host machine several times,

or they are the same data that were moved internally by HBS.

In all cases, an analysis of the Writedate and Writetime basic data for all occurrences of this volume may be used to determine which is the most recent and therefore the only one that is valid. If the most recent version is present in several locations (same Writedate and Writetime information), any of these occurrences can be used to become the valid version of the volume in the new base. All that is necessary then is to recreate an empty database and fill in all the tables using the collected information.
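The selection of the valid occurrence of each volume can be sketched as follows. This is a minimal illustration, assuming that each occurrence has already been parsed into a dictionary keyed by the basic data field names, and that WriteDate and WriteTime use the yyyy-mm-dd and hh:mm:ss forms given above, so that plain string comparison orders them chronologically. The function name and the `Cartridge` key are hypothetical.

```python
def select_valid_versions(occurrences):
    """Keep, for each (host, volume name), the most recently written occurrence.

    When several occurrences share the same write date and time, the first one
    encountered is kept; any of them could serve as the valid version.
    """
    latest = {}
    for occ in occurrences:
        key = (occ["HostCode"], occ["VolumeName"])
        stamp = (occ["WriteDate"], occ["WriteTime"])
        kept = latest.get(key)
        if kept is None or stamp > (kept["WriteDate"], kept["WriteTime"]):
            latest[key] = occ
    return latest


occurrences = [
    {"HostCode": 1, "VolumeName": "PAYROLL", "WriteDate": "2001-03-01",
     "WriteTime": "08:00:00", "Cartridge": "C1"},
    {"HostCode": 1, "VolumeName": "PAYROLL", "WriteDate": "2001-03-02",
     "WriteTime": "07:30:00", "Cartridge": "C2"},
    {"HostCode": 1, "VolumeName": "PAYROLL", "WriteDate": "2001-03-02",
     "WriteTime": "07:30:00", "Cartridge": "C3"},
]
valid = select_valid_versions(occurrences)
print(valid[(1, "PAYROLL")]["Cartridge"])
```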

1. A method for saving data originating from a host computer composed of a computer equipment including an input-output interface for exchanging data with the host computer, the said interface comprising a backup reader-inscriber emulator, at least one intermediate storage device and a tape reader-inscriber, the equipment also comprising a processor for making transfers between the input-output interface or the intermediate storage device interface and the tape reader-inscriber, wherein the system also comprises a supervisor comprising a memory for saving information about records on the computer equipment tape, and to control the said computer equipment as a function of instructions originating from the host computer, and a memory for making use of a database containing identification labels of the backed up data.
2. A method for saving data according to claim 1, wherein the emulator is composed of a computer for analysing signals originating from the host computer and for generating a response corresponding to the type of simulated cassette reader-inscriber.
3. A method for saving data according to claim 1, wherein the intermediate storage device is composed of at least one hard disk.
4. A method for saving data according to claim 2, wherein the intermediate storage device is composed of at least one hard disk.
5. A method for saving data according to claim 3, wherein the numeric data forming the identification labels include the volume name, the medium on which it is stored and the position on the medium.
6. A method for saving data according to claim 4, wherein the numeric data forming the identification labels include the volume name, the medium on which it is stored and the position on the medium.
7. A method for saving data according to claim 1, wherein the supervisor station is connected to the backup equipment and to the host machine through serial links.
8. A method for saving data according to claim 2, wherein the supervisor station is connected to the backup equipment and to the host machine through serial links.
9. A method for saving data according to claim 3, wherein the supervisor station is connected to the backup equipment and to the host machine through serial links.
10. A method for saving data according to claim 4, wherein the supervisor station is connected to the backup equipment and to the host machine through serial links.
11. A method for saving data according to claim 5, wherein the supervisor station is connected to the backup equipment and to the host machine through serial links.
12. A method for saving data according to claim 6, wherein the supervisor station is connected to the backup equipment and to the host machine through serial links.
13. A method for saving data according to claim 1, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
14. A method for saving data according to claim 2, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
15. A method for saving data according to claim 3, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
16. A method for saving data according to claim 4, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
17. A method for saving data according to claim 5, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
18. A method for saving data according to claim 6, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
19. A method for saving data according to claim 7, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
20. A method for saving data according to claim 8, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
21. A method for saving data according to claim 9, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
22. A method for saving data according to claim 10, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
23. A method for saving data according to claim 11, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
24. A method for saving data according to claim 12, wherein the backup equipment is connected to the host machine through an SCSI or FC type link.
25. A method for saving data from a host computer wherein the input-output interface of a backup equipment is emulated so that the behaviour of the backup equipment is identical to the behaviour of a streamer, as far as the host machine is concerned, the said backup equipment comprising an intermediate storage means that is not the same as the streamer.